Anna Rohrbach

35 publications

11 venues

H Index 16

Affiliation

TU Darmstadt, Germany
University of California, Berkeley, CA, USA
Max Planck Institute for Informatics, Saarbrücken, Germany

Title	Venue	Year	Citations
MammalNet: A Large-Scale Video Benchmark for Mammal Recognition and Behavior Understanding.	CVPR	2023	0
Using Language to Extend to Unseen Domains.	ICLR	2023	0
Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly.	ECCV	2022	2
On Guiding Visual Attention with Language Specification.	CVPR	2022	2
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension.	ACL	2022	12
The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning.	ECCV	2022	5
TL;DW? Summarizing Instructional Videos with Task Relevance and Cross-Modal Saliency.	ECCV	2022	0
DETReg: Unsupervised Pretraining with Region Priors for Object Detection.	CVPR	2022	0
Object-Region Video Transformers.	CVPR	2022	0
K-LITE: Learning Transferable Visual Models with External Knowledge.	NIPS/NeurIPS	2022	0
Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens.	NIPS/NeurIPS	2022	0
How Much Can CLIP Benefit Vision-and-Language Tasks?	ICLR	2022	0
NewsCLIPpings: Automatic Generation of Out-of-Context Multimodal Media.	EMNLP	2021	16
CLIP-It! Language-Guided Video Summarization.	NIPS/NeurIPS	2021	23
Compositional Video Synthesis with Action Graphs.	ICML	2021	0
Identity-Aware Multi-sentence Video Description.	ECCV	2020	6
Advisable Learning for Self-Driving Vehicles by Internalizing Observation-to-Action Rules.	CVPR	2020	17
Robust Change Captioning.	ICCV	2019	57
Language-Conditioned Graph Networks for Relational Reasoning.	ICCV	2019	116
Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation.	ACL	2019	70
Adversarial Inference for Multi-Sentence Video Description.	CVPR	2019	0
Women Also Snowboard: Overcoming Bias in Captioning Models.	ECCV	2018	324
Multimodal Explanations: Justifying Decisions and Pointing to the Evidence.	CVPR	2018	287
Speaker-Follower Models for Vision-and-Language Navigation.	NIPS/NeurIPS	2018	307
Textual Explanations for Self-Driving Vehicles.	ECCV	2018	155
Video Object Segmentation with Language Referring Expressions.	ACCV	2018	67
Object Hallucination in Image Captioning.	EMNLP	2018	126
Fooling Vision and Language Models Despite Localization and Attention Mechanism.	CVPR	2018	0
Generating Descriptions with Grounded and Co-referenced People.	CVPR	2017	54
Gradient-free Policy Architecture Search and Adaptation.	CoRL	2017	24
A Dataset and Exploration of Models for Understanding Video Data through Fill-in-the-Blank Question-Answering.	CVPR	2017	0
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding.	EMNLP	2016	1218
Commonsense in Parts: Mining Part-Whole Relations from the Web and Image Tags.	AAAI	2016	30
Grounding of Textual Phrases in Images by Reconstruction.	ECCV	2016	0
A Dataset for Movie Description.	CVPR	2015	360